Abstract

We develop a new method for tagging jets produced by hadronically decaying top quarks. The
method is an application of shower deconstruction, a maximum information approach that was
previously applied to identifying jets produced by Higgs bosons that decay to $b\bar b$. We tag an
observed jet as a top jet based on a cut on a calculated variable $\chi$ that is an approximation to the
ratio of the likelihood that a top jet would have the structure of the observed jet to the likelihood
that a non-top QCD jet would have this structure. We find that the shower deconstruction based
tagger can perform better in discriminating boosted top quark jets from QCD jets than other
publicly available tagging algorithms.
I. INTRODUCTION



A generic problem of some importance at hadron colliders like the Large Hadron Collider 
(LHC) is to find events generated by a signal process of interest among events generated by 
less interesting background processes. For this purpose, one often looks for events with one 
or more jets containing the decay products of a heavy particle that has been produced with 
a transverse momentum that is substantially larger than its mass, so that the sought decay 
products are part of a visible jet [1]. An important example is looking for jets that contain
the decay products of a hadronically decaying top quark. One wants to distinguish top jets 
from the more numerous ordinary QCD jets that do not contain the decay products of a top 
quark. Experience shows that the analysis of jet substructure is useful for this purpose [2]. 

Using jet substructure, one wants to be able to tag a jet with a label t such that a jet
with t = top is likely to be a top jet and a jet with t = other is not so likely to be a top
jet. Several such top tagging algorithms are available [5–11]. Somewhat more generally, one
would like to be able to assign a real variable $\chi$ to a jet such that a large value of $\chi$ indicates
a jet that is likely to be a top jet and a small value of $\chi$ indicates a jet that is unlikely to be
a top jet. Then a top/other tag can correspond to a cut on $\chi$, but the cut can be adjusted
at will to increase or decrease the fraction of top jets that pass the cut while correspondingly
increasing or decreasing the fraction of background jets that pass the cut.

In Ref. [12], we described a method called shower deconstruction to distinguish signal
jets from background jets. We applied the method to jets containing the decay products of
a Higgs boson that decays to $b + \bar b$. In this simple example, we found that shower deconstruction
worked well enough to perform better than the Butterworth-Davison-Rubin-Salam (BDRS)
method [13] in accomplishing the same end. In this paper, we extend the shower
deconstruction method to finding top quark jets. This case includes richer physics: a) the top
quark can decay, but until it decays it can emit gluons, and b) one of the daughter particles,
the W boson, itself decays.

With this richer physics to work with, one might expect that shower deconstruction 
would do well compared to presently existing methods. To find out, we compute results 
for background fake rate versus signal acceptance obtained with shower deconstruction and 
compare to the results of existing top taggers. 

Our plan is as follows. In Sec. II, we very briefly describe the general ideas of shower
deconstruction, referring to Ref. [12] for a fuller explanation of the method. In Sec. III, we
describe in more detail the nature of a parton shower with decays and especially with decays
of strongly interacting particles and with more than one level of decays. We concentrate
on the physical principles and the main formulas and leave some details to Appendix A.
Then in Secs. IV, V, and VI, we study the tagging performance of shower deconstruction,
varying the boost of the possible top jet and the cone size used to define it. In Sec. VII,
we explain how shower deconstruction could be used to measure a parameter of the signal
theory, namely the W mass. Finally, in Sec. VIII, we offer some conclusions.



II. SHOWER DECONSTRUCTION 



We seek to distinguish a jet that contains the decay products of a hadronically decaying 
top quark from a jet produced by ordinary QCD processes that do not involve a top quark. 
The jet to be examined is presumed to have a large transverse momentum, several hundred 
GeV. It is constructed with a standard jet algorithm, such as the Cambridge-Aachen
algorithm [14], using a large effective cone size so as to have a good chance of capturing the
decay products of a top quark within the jet. This is the "fat jet."

We group the contents of the fat jet into narrow subjets, which we call microjets. In an 
experimental implementation, the microjets would be constructed directly from information 
on the energy deposits in the calorimeter and tracker, using as fine an angular resolution as 
is practical. 

The computational time needed to analyze an event increases quite quickly with the
number of microjets. However, we find that the lowest transverse momentum microjets
carry little useful information. Accordingly, we choose a number $N_{\rm max}$ with default value
$N_{\rm max} = 9$ and discard the lowest transverse momentum microjets if there are more than
$N_{\rm max}$ microjets, keeping the $N_{\rm max}$ microjets that have the highest transverse momenta.
Additionally, we discard microjets with $p_T^{\rm micro} < p_T^{\rm min}$, with default value $p_T^{\rm min} = 5$ GeV.
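
The microjet selection just described can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `select_microjets` and the tuple layout (transverse momentum first) are assumptions made here for the example.

```python
# Hypothetical representation: each microjet is a tuple whose first entry
# is its transverse momentum in GeV; the remaining entries are not used here.

def select_microjets(microjets, n_max=9, pt_min=5.0):
    """Discard microjets below pt_min, then keep the n_max hardest ones."""
    kept = [m for m in microjets if m[0] >= pt_min]
    kept.sort(key=lambda m: m[0], reverse=True)  # hardest first
    return kept[:n_max]
```

The two defaults mirror the values quoted in the text (nine microjets, 5 GeV); both are left as arguments because they are tunable choices.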

This process gives the fine grained information with which we describe the fat jet in
shower deconstruction: the four-momenta $\{p\}_N = \{p_1, p_2, \ldots, p_N\}$ of the microjets. From
these variables, we wish to construct a function $\chi(\{p\}_N)$ with the property that large $\chi$
corresponds to a high likelihood that the jet is a top jet. In fact, we define $\chi$ as the
likelihood ratio

$$ \chi(\{p\}_N) = \frac{P(\{p\}_N|S)}{P(\{p\}_N|B)} , \tag{1} $$

where $P(\{p\}_N|S)$ is the probability density that a jet in a sample of top jets ("signal jets")
would have the configuration $\{p\}_N$ and $P(\{p\}_N|B)$ is the probability density that a jet
in a sample of background jets would have the configuration $\{p\}_N$. One might imagine
constructing $P(\{p\}_N|S)$ and $P(\{p\}_N|B)$ by generating events with a trusted parton shower
Monte Carlo event generator. However, that method is not practical. Instead, we calculate
$P(\{p\}_N|S)$ and $P(\{p\}_N|B)$ by calculating the probabilities that a simplified approximation
to a shower Monte Carlo event generator would generate $\{p\}_N$ according to the signal
hypothesis and the background hypothesis, respectively. Our simplified approximation to a
shower Monte Carlo event generator is based on the shower algorithms described in Refs. [15–18]
and in unpublished work in this ongoing series of papers [19].

How can one calculate these probabilities? Consider, for example, $P(\{p\}_N|S)$. We take
$\{p\}_N$ to be the momenta carried by partons at a fairly late stage of a parton shower. In
Fig. 1, we show a possible shower history by which an event generator might generate a
particular $\{p\}_N$. A top quark is created in a hard interaction, indicated by the star in the
figure. In this shower history, the top quark emits a gluon. Then it decays into a W and a
b quark. The b quark emits a gluon. The W decays to two light quarks. Meanwhile, initial
state splittings, depicted by diamond vertices, create two gluons. After two QCD final state
splittings, the two gluons have become four. Our shower model is simplified. In reality, there
are two incoming partons, "a" and "b," that initiate the hard interaction. However, we
do not distinguish which incoming partons split to create new partons. Also, we take the
partons created by initial state splittings to be gluons.

We should emphasize that not all partons in the event are represented in the shower 
history for the fat jet. One could depict a shower history for a whole event, but any parton 
in the complete shower history that does not have at least one descendant in the fat jet is 
left out of the shower history for the jet. 

Now, given the shower history depicted in Fig. 1, we assign a splitting probability or a
decay probability to each vertex. The splitting probabilities are approximately the splitting
probabilities that are used in parton shower event generators. They take into account
information on color flow in the event history. The decay probabilities are approximately
the decay probabilities that would be used in an event generator. Each propagator in the
shower history corresponds to a Sudakov factor that gives, approximately, the probability
not to have had a splitting between one vertex and the next or between the last vertex
and the end of the shower. Thus, for a given shower history corresponding to the signal
hypothesis, we calculate a probability density that that shower history would have produced
the observed state $\{p\}_N$.

FIG. 1: Shower history for a top quark jet. The hard interaction is indicated by a star. Initial
state emissions are indicated by diamonds. Parton decays are indicated by large filled circles and
QCD splittings are indicated by small filled circles.

FIG. 2: Shower history for a QCD jet.

There are many shower histories that could lead to a given $\{p\}_N$. We sum the
corresponding probabilities over all possible shower histories to calculate $P(\{p\}_N|S)$.

For the background hypothesis, we have different sorts of shower histories. One is shown
in Fig. 2. Again, we calculate the approximate probability density that the shower history
would have produced the observed state $\{p\}_N$. Then we sum the corresponding probabilities
over all possible shower histories to calculate $P(\{p\}_N|B)$.
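
The sum over shower histories and the resulting likelihood ratio can be sketched as follows. This is a minimal illustration, assuming that the probability density for each individual history has already been computed by some other routine; the function name and the list inputs are hypothetical and are not taken from the authors' code.

```python
def likelihood_ratio(signal_history_weights, background_history_weights):
    """Ratio of summed signal-history weights to summed background-history weights.

    Each input is an iterable of non-negative probability densities, one per
    shower history that could have produced the observed microjets.
    """
    p_signal = sum(signal_history_weights)
    p_background = sum(background_history_weights)
    if p_background <= 0.0:
        raise ValueError("no background history reproduces this configuration")
    # An empty signal list gives 0.0: no signal history matches the jet.
    return p_signal / p_background
```

A jet for which no signal history is found gets a ratio of exactly zero in this sketch.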

Of course, this brief description leaves out a lot of details. Most of them are presented in
Ref. [12]. Because they are of some importance to the structure of the model, we reiterate
in Sec. II A some specifics of the kinematics and the choice of shower time. Then, in Sec. III,
we address some issues that arise with particles that decay, particularly with particles that
carry color and decay.



A. Kinematics and choice of shower time 

Each parton in a shower history carries a label. We denote the momentum of parton $i$
by $p_i$. The absolute value of its transverse momentum is $k_i$, its rapidity is $y_i$, its azimuthal
angle around the beam axis is $\phi_i$, and its virtuality is $\mu_i^2 = p_i^2 - m_i^2$.

In this study, we take gluons, light quarks, and b quarks to have mass zero. This is not
right for b quarks, but it should be a reasonable approximation as long as the b quark has a
transverse momentum $k_i$ with $k_i \gg m_b$. In the signal process, we also have top quarks and
a W boson. These have masses $m_t$ and $m_W$, respectively.
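
As an illustration of these definitions, the following sketch computes the four quantities from a four-momentum. It assumes components ordered as (E, px, py, pz) in GeV, the beam along the z axis, and the metric convention $p^2 = E^2 - |\vec p|^2$; the function name is hypothetical.

```python
import math

def kinematics(p, mass=0.0):
    """Return (k, y, phi, mu2) for a four-momentum p = (E, px, py, pz)."""
    e, px, py, pz = p
    k = math.hypot(px, py)                        # |transverse momentum|
    y = 0.5 * math.log((e + pz) / (e - pz))       # rapidity
    phi = math.atan2(py, px)                      # azimuthal angle about the beam
    mu2 = e * e - px * px - py * py - pz * pz - mass * mass  # virtuality
    return k, y, phi, mu2
```

For a massless on-shell parton the returned virtuality vanishes, as expected.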

For shower deconstruction, the momentum $p_J$ of a mother parton is related to the
momenta $p_A$ and $p_B$ of the daughter partons by $p_J = p_A + p_B$. This is different from what an
ordinary parton shower event generator does. In an ordinary parton shower event generator,
$p_A$ is the sum of the momenta of its daughter particles, but modified to put it on shell, so
$p_A^2 = 0$ for a massless parton. This modification is an approximation, imposed because the
generator does not "know" what $p_A^2$ is at the time that parton A is generated in $J \to A + B$.
Thus the best that the generator can do is to put parton A on shell. Then when parton A
splits the event generator "finds out" what $p_A^2$ should be and takes the needed extra
momentum from somewhere else in the event. For shower deconstruction, however, we do not
need to make this approximation and, in fact, the relation $p_J = p_A + p_B$ is quite convenient.

In each splitting function, there is a factor $1/\mu_J^2$, where $\mu_J^2$ is the jet virtuality, defined by

$$ \mu_J^2 = (p_A + p_B)^2 - m_J^2 . \tag{2} $$

Here $m_J$ is the top quark mass in the case that J represents a top quark and otherwise
$m_J = 0$. In calculating $(p_A + p_B)^2$, we do not approximate $p_A$ and $p_B$ as being on shell.

Parton splittings in the shower are ordered from hard to soft. Consider the splitting of
parton J with momentum $p_J$ and absolute value of transverse momentum $k_J$. Following
[19], this splitting is assigned a shower time $t$ according to

$$ e^{-t} = \frac{\mu_J^2}{|Q_0|\, k_J} . \tag{3} $$

The shower splittings are ordered in order of increasing $t$. Note that we divide by the
transverse momentum of the mother parton. In order to make $\exp(-t)$ dimensionless, we
also divide by a reference scale $|Q_0|$ on the order of the momentum transfer in the hard
scattering that initiates the fat jet.

In the case of a parton decay $J \to A + B$ rather than a splitting, we assign to the decay
the shower time

$$ e^{-t} = \frac{|p_J^2 - m_J^2|}{|Q_0|\, k_J} . \tag{4} $$

The only difference here is that $p_J^2$ can be less than $m_J^2$, so we use the absolute value of
$p_J^2 - m_J^2$.
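
The shower time assignment can be made concrete with a short sketch. It assumes four-momenta ordered as (E, px, py, pz) in GeV with the beam along z, and it takes the absolute value of the virtuality in all cases, which changes nothing for an ordinary splitting (where the virtuality is positive) and is what the decay case requires. The function names are hypothetical.

```python
import math

def minkowski_square(p):
    """p^2 = E^2 - px^2 - py^2 - pz^2 for p = (E, px, py, pz)."""
    e, px, py, pz = p
    return e * e - px * px - py * py - pz * pz

def shower_time(p_a, p_b, q0, m_mother=0.0):
    """Shower time t defined by exp(-t) = |p_J^2 - m_J^2| / (|Q_0| k_J).

    The mother momentum is the plain sum p_J = p_A + p_B; the daughters
    are not put on shell.
    """
    p_j = tuple(a + b for a, b in zip(p_a, p_b))
    k_j = math.hypot(p_j[1], p_j[2])              # mother transverse momentum
    virtuality = abs(minkowski_square(p_j) - m_mother ** 2)
    return -math.log(virtuality / (q0 * k_j))
```

Harder splittings give smaller values of the shower time, so sorting vertices by increasing return value reproduces the hard-to-soft ordering.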
III. DECAYING PARTICLES



The shower histories for the signal process considered have three stages, the first arising
from the creation of the top quark, the others arising from the decays of the top quark
and then of the W boson created in the top quark decay. The description of these stages
is implicit in parton showers generally. Specifically, we follow unpublished work [19] in the
series [15–18].

In the first stage, a top quark is created in a hard process. We look for a high transverse
momentum jet that contains the top quark. If we are to have a top quark jet, the top quark
transverse momentum $k_t$ must be larger than the top mass $m_t$. We may place cuts that
require $k_t \gg m_t$. Thus in a parton shower picture the top quark has a potentially large
virtuality to start with and can radiate gluons. This gives what we can call shower I: the top
quark can radiate one or more gluons and create a full parton shower as the gluons split. In
this first shower, radiation from the initial state partons can also occur and create partons
with angles that place them as part of the fat jet.

For this first shower, we need splitting functions for quarks and gluons other than the 
top quark, possibly with the top quark serving as color connected partner. Thus we allow 
a large mass for the color connected partner. We also need a splitting function for the top 
quark, this time with a massless color connected partner. 

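The first shower is suspended when the top quark decays. This happens at a varying
shower time corresponding roughly to $|\mu_t^2| = |p_t^2 - m_t^2| \sim m_t \Gamma_t$. With our definition of
shower time, the first shower is suspended at shower time $t_1$ given by

$$ |Q_0|\, e^{-t_1} = \frac{|p_t^2 - m_t^2|}{k_t} \sim \frac{m_t \Gamma_t}{k_t} . \tag{5} $$

Now a second shower, shower II, is created by the decay $t \to b + W$. The b quark can
emit a gluon, initiating a shower. There is a minimum value for the starting shower time,
$t_0^{\rm II}$, for shower II. This is determined by the maximum of

$$ |Q_0|\, e^{-t} = \frac{\mu_J^2}{k_J} . \tag{6} $$

Here J is the bottom quark just after the decay and $\mu_J^2$ is the virtuality in the splitting of the
bottom quark. Let us look at this using $+$, $-$, $\perp$ momentum components (we use
$v^\pm = (v^0 \pm v^3)/\sqrt{2}$) in a frame in which
the top quark before the decay has large $+$ momentum, much larger than the top mass, and
zero transverse momentum. In this frame, the top quark momentum is approximately

$$ p_t = \left( \sqrt{2}\, k_t ,\ \frac{m_t^2}{2\sqrt{2}\, k_t} ,\ \mathbf{0} \right) . \tag{7} $$

The momentum of the bottom quark is

$$ p_J = \left( \sqrt{2}\, z k_t ,\ \frac{\mu_J^2 + \kappa^2}{2\sqrt{2}\, z k_t} ,\ \boldsymbol{\kappa} \right) . \tag{8} $$

Here $k_J = z k_t$. The momentum of the W boson is

$$ p_W = \left( \sqrt{2}\, (1-z) k_t ,\ \frac{m_W^2 + \kappa^2}{2\sqrt{2}\, (1-z) k_t} ,\ -\boldsymbol{\kappa} \right) . \tag{9} $$

Momentum conservation for the $-$ component of momentum gives

$$ \frac{m_t^2}{2\sqrt{2}\, k_t} = \frac{\mu_J^2 + \kappa^2}{2\sqrt{2}\, z k_t} + \frac{m_W^2 + \kappa^2}{2\sqrt{2}\, (1-z) k_t} . \tag{10} $$

Thus

$$ \frac{\mu_J^2}{k_J} = \frac{m_t^2}{k_t} - \frac{m_W^2}{(1-z) k_t} - \frac{\kappa^2}{z(1-z) k_t} . \tag{11} $$

This is maximized for $\kappa = 0$ and then for $z \to 0$. This gives

$$ \frac{\mu_J^2}{k_J} < \frac{m_t^2 - m_W^2}{k_t} . \tag{12} $$

Thus shower II starts at the starting splitting scale

$$ |Q_0|\, e^{-t_0^{\rm II}} = \frac{m_t^2 - m_W^2}{k_t} . \tag{13} $$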

Of course, this calculation has assumed that the top quark and the W boson are on shell. 
This is not exactly true in a real shower event, but it should be an adequate approximation. 
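
The bound that fixes the starting scale of shower II can be checked numerically. The sketch below assumes an on-shell top quark and W boson, a massless b quark, and the relation $\mu_J^2/k_J = [\,m_t^2 - m_W^2/(1-z) - \kappa^2/(z(1-z))\,]/k_t$ that follows from conservation of the minus component of momentum. The default masses are the central values of the mass windows used later in the paper (172.3 GeV and 80.4 GeV); the function names are hypothetical.

```python
def mu2_over_kj(z, kappa, k_t, m_t=172.3, m_w=80.4):
    """mu_J^2 / k_J for momentum fraction z and relative transverse momentum kappa."""
    return (m_t ** 2 - m_w ** 2 / (1.0 - z) - kappa ** 2 / (z * (1.0 - z))) / k_t

def start_scale_shower_ii(k_t, m_t=172.3, m_w=80.4):
    """Supremum of mu_J^2 / k_J, approached for kappa = 0 and z -> 0."""
    return (m_t ** 2 - m_w ** 2) / k_t
```

Scanning $z$ over (0, 1) and several values of $\kappa$ never exceeds the value returned by `start_scale_shower_ii`, and the bound is approached as $z \to 0$ with $\kappa = 0$.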

In shower II, the bottom quark and its descendants can emit gluons, which can either be 
collinear to the mother parton or soft. One can also have initial state radiation of gluons: 
the top quark is the initial state parton whose decay starts shower II and it can radiate 
gluons just before the decay. Now, in shower II, the virtuality of a splitting is never large
compared to $m_t^2$. For that reason, there is never an approximate collinear singularity for
gluon emission collinear with the top quark. However, there is a singularity corresponding
to soft gluon emission. Recall that one can think of soft gluons as being emitted from color
dipoles. Thus, in shower II, a soft gluon can be emitted from a dipole consisting of the top
quark just before the decay and the bottom quark or one of its daughter partons. Normally,
one would partition the splitting function for gluon emission from such a dipole into two
terms, as we do for other dipoles. One term would correspond to gluon emission from the
top quark and the other would correspond to gluon emission from the bottom quark or its
daughter parton. However, it will be simpler for us not to partition emissions from this dipole.
We simply treat the gluon emissions kinematically as coming from the bottom quark or its 
descendants, with a splitting function that accounts for graphs in which the gluon is soft 
and connects with the top quark in the eikonal approximation. 

Shower II is suspended at a splitting time corresponding to the W boson decay. This
happens roughly when $|p_W^2 - m_W^2| \sim m_W \Gamma_W$. With our definition of shower time, the second
shower is suspended at shower time $t_2$ around

$$ |Q_0|\, e^{-t_2} \sim \frac{m_W \Gamma_W}{k_W} . \tag{14} $$

Now a third shower is created by the decay $W \to q + \bar q$. Either of the new quarks can
emit a gluon, initiating a shower. Shower III starts at the starting splitting scale

$$ |Q_0|\, e^{-t_0^{\rm III}} = \frac{m_W^2}{k_W} . \tag{15} $$

(The derivation of this follows the derivation above for the start of shower II.)

What happens to the "suspended" showers? Let us suppose that $t_1 > t_2$. Then the second
shower is suspended before it reaches shower time $t_1$. Now we start the third shower. When
the third shower reaches shower time $t_2$, the partons in the third shower are splitting on
a slow enough time scale that their splittings can interfere with splittings from the second
shower. Thus we continue both of these showers together. We now have the possibility
that partons in shower II can have partons in shower III as color connected partners and
vice versa. However, this doesn't happen because the W boson carries no color, so that
the partons in shower III are in any case color connected only to each other. Now the
combined showers II and III continue until they reach shower time $t_1$. Then the partons in
the combined showers II and III are splitting on a slow enough time scale that their splittings
can interfere with splittings from shower I. Thus we continue all three showers together. We
now have the possibility that any parton can be color connected to any other parton. The
complete shower evolves until the end of showering. The method for restarting suspended
showers is analogous if $t_2 > t_1$.

We see that a parton shower that properly accounts for interference effects within the 
leading color approximation will reset color connected partners when one of the subshowers 
reaches the shower time at which a parent shower was suspended. This procedure will affect 
wide angle splittings at rather small virtualities. We expect that the effect of resetting color
connections will not be numerically very significant. Thus in shower deconstruction in this
paper, we omit the step of resetting color connections in this way.

If the original top quark has high enough transverse momentum, more top quarks can
be created within shower I: the top quark can emit a gluon and the gluon can split into
a $t\bar t$ pair. This can happen more than once. Each $t$ or $\bar t$ thus created evolves until it is
nearly on shell. Then each decays to $b + W^+$ or $\bar b + W^-$, creating a new independent bottom
quark subshower. Each W also decays, creating a new independent subshower if it decays
hadronically. At later stages, all of the subshowers rejoin. In the situation that we consider
in this paper, the top quark transverse momentum is not large enough for these effects to
be important, so we simply ignore the possibility of multiple $t\bar t$ creation.

IV. RESULTS FOR MODERATELY BOOSTED TOP JET 

A. Generating events 

In order to test how well shower deconstruction works for finding top quark jets, we
generate signal $t\bar t$ and background dijet samples using standard QCD processes in Pythia
8 [20] and Herwig++ [21]. We remove the invisible particles from the fully hadronized
final state and use the remaining particles with $|y| < 5.0$ as input for the Cambridge-Aachen
jet-finding algorithm [14] as implemented in Fastjet [22] with $R = 1.0$. To accept an event
we require at least two jets with $p_{T,j} > 500$ GeV each. We then analyze the two jets with
the highest $p_{T,j}$.
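
The event selection can be summarized in a short sketch. It assumes that jet clustering has already been done elsewhere and that only the list of fat jet transverse momenta (in GeV) is passed in; the function name is hypothetical.

```python
def select_fat_jets(jet_pts, pt_min=500.0, n_required=2):
    """Return the pT values of the hardest jets of an accepted event, else None.

    An event is accepted if at least n_required jets have pT above pt_min;
    the n_required hardest jets are then passed on for analysis.
    """
    passing = sorted((pt for pt in jet_pts if pt > pt_min), reverse=True)
    if len(passing) < n_required:
        return None
    return passing[:n_required]
```

Keeping the threshold as an argument lets the same function express other selections by changing `pt_min`.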

We analyze each fat jet using shower deconstruction. Additionally, we independently tag
each of the jets as a top quark jet or not using four different taggers: the Johns Hopkins
tagger [5], the CMS tagger [7], the HEPTopTagger [8], and the NSubjettiness tagger [10]. These
taggers take as input the individual hadrons that make up the fat jet. Shower
deconstruction aims to take the finite resolution of the detector into account by operating on small
reconstructed jets instead of hadrons. We call the small jets microjets. To construct the
microjets, we use the $k_T$-algorithm [23] with $R = 0.2$.

FIG. 3: $(1/N)\, dN/d\log\chi$ for signal events (upper curve) and $(1/N)\, dN/d\log\chi$ for background
events (lower curve) for samples of signal and background events generated by Pythia. We use
the cuts described in Sec. IV A.

B. Parameters for top taggers 

For shower deconstruction, we remove microjets from the analysis unless $p_T^{\rm micro} > 5.0$
GeV. If more than nine microjets are left, we remove those with the lowest $p_T$ values until
nine microjets remain.

Each of the top taggers other than shower deconstruction constructs a top mass and
a W mass for each jet that meets the structural criteria of the tagger. Each tagger then
requires that these reconstructed masses be in specified windows. We specify that a top is
correctly reconstructed in a window of $172.3 \pm 25.0$ GeV and a W in a window of $80.4 \pm 10.0$
GeV, except for the NSubjettiness tagger, where $160.0 < m_t < 240$ GeV and $\tau_3/\tau_2 < 0.6$ as
recommended in [10].

Top tagging based on shower deconstruction uses the full decay matrix element including
a Breit-Wigner factor to assign a weight for a given microjet configuration. Thus, the
total widths of the top quark and the W boson are input parameters. However, because the
physical widths are very small, we assume that the invariant mass of a set of microjets cannot
be resolved at the level of the physical widths. To take these experimental uncertainties into
account, we use values for the top width and the W width equal to half of the corresponding
total mass window, i.e. $\Gamma_t \to 25$ GeV and $\Gamma_W \to 10$ GeV.
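
To illustrate the role of the enlarged widths, here is a sketch of a Breit-Wigner weight. The text does not spell out the exact form of the factor, so the standard unnormalized relativistic Breit-Wigner shape is assumed; the function names are hypothetical.

```python
def effective_width(window_low, window_high):
    """Half of the total mass window, used in place of the physical width."""
    return 0.5 * (window_high - window_low)

def breit_wigner(p2, mass, width):
    """Unnormalized relativistic Breit-Wigner factor (assumed form).

    p2 is the invariant mass squared of the candidate, in GeV^2.
    """
    return 1.0 / ((p2 - mass ** 2) ** 2 + (mass * width) ** 2)
```

With the top window of 172.3 ± 25.0 GeV, `effective_width(147.3, 197.3)` gives 25 GeV, so candidates whose reconstructed mass is tens of GeV from the nominal value still receive an appreciable weight.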

Other parameters for shower deconstruction are as in Ref. [12] . 

C. Distributions versus $\chi$

Using shower deconstruction, for each fat jet in the event, we calculate $\chi$. About 32%
of signal jets have $\chi = 0$ because the shower deconstruction algorithm cannot find a shower
history that matches the signal hypothesis within the cuts that are built into the algorithm.
This represents a failing, suggesting that the algorithm is overly strict. However, 68% of
the signal jets remain. This number can be increased by increasing the top and W mass
windows. The distribution of the $\log\chi$ values for $\log\chi > -10$ is shown in the upper curve
in Fig. 3. The bin with $\chi = 0$ is not displayed. The distribution is normalized to
the total number of generated signal jets, so that the integral under the curve including the
$\chi = 0$ bin is 1 and the integral above $\log\chi = -10$ is about 0.68.

Similarly, for each generated background jet, we calculate $\chi$. About 86% of these jets
have $\chi = 0$ because the shower deconstruction algorithm cannot find a shower history that
matches the signal hypothesis within the cuts that are built into the algorithm. That is,
most of the background jets do not look at all like signal jets. About 14% of the background
jets remain. The distribution of the $\log\chi$ values for $\log\chi > -10$ is shown in the lower curve
in Fig. 3. The distribution is normalized to the total number of generated background jets,
so that the integral under the curve including the $\chi = 0$ bin is 1 and the integral above
$\log\chi = -10$ is about 0.12.
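
The normalization convention used for these distributions can be sketched as follows. The histogram is divided by the total number of jets, including those with a vanishing likelihood ratio, so the visible part integrates to less than one. This is a minimal illustration that assumes natural logarithms and a plotting range of −10 to 10; the function name and binning are hypothetical.

```python
import math

def log_chi_histogram(chis, lo=-10.0, hi=10.0, n_bins=20):
    """Return (density per bin, fraction of jets with chi == 0).

    The density is normalized to ALL jets, so that sum(density) * bin_width
    equals the fraction of jets with lo <= log(chi) < hi.
    """
    n_total = len(chis)
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    n_zero = 0
    for chi in chis:
        if chi <= 0.0:
            n_zero += 1          # no signal history found: log(chi) = -infinity
            continue
        x = math.log(chi)
        if lo <= x < hi:
            counts[int((x - lo) / width)] += 1
    density = [c / (n_total * width) for c in counts]
    return density, n_zero / n_total
```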

The idea of shower deconstruction is that the distribution in $\log\chi$ for signal jets should
be very different from the distribution for background jets. We see in Fig. 3 that this is the
case. First of all, most of the background jets have $\log\chi = -\infty$ and are not visible in the
graph. Second, few background jets have x > ^ ■ On the other hand, signal jets frequently 
have $\chi \sim e^4$.



D. Discriminating signal from background 

The simplest way to make use of the differing $\chi$ distributions between signal jets and
background jets is to tag the fat jet as top or other according to whether $\chi$ is greater than
or less than some fixed value $\chi_{\rm cut}$. With such a cut, some fraction A of the signal jets will be
correctly labeled as top jets. One calls A the signal acceptance (or the tagging efficiency).
Correspondingly, some fraction F of the background jets will be incorrectly labeled as top
jets. One calls F the background fake rate (or the mistag rate). We want A to be big
and F to be small. If we lower $\chi_{\rm cut}$, we make A bigger, but unfortunately F gets bigger at
the same time. We can make F smaller by raising $\chi_{\rm cut}$, but then A gets smaller. This is
illustrated in Fig. 4.
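
The trade-off between acceptance and fake rate can be sketched directly from lists of likelihood ratio values. This is a minimal illustration; the function names are hypothetical, and the inputs are assumed to include the jets with a vanishing ratio so that the fractions refer to all generated jets.

```python
def acceptance_and_fake_rate(signal_chis, background_chis, chi_cut):
    """Fractions of signal and background jets with chi above chi_cut."""
    acceptance = sum(1 for c in signal_chis if c > chi_cut) / len(signal_chis)
    fake_rate = sum(1 for c in background_chis if c > chi_cut) / len(background_chis)
    return acceptance, fake_rate

def scan_cuts(signal_chis, background_chis, cuts):
    """One (acceptance, fake rate) point per cut value, tracing out the curve."""
    return [acceptance_and_fake_rate(signal_chis, background_chis, c) for c in cuts]
```

Raising the cut can only lower both numbers, which is why the result is a single monotonic curve rather than a single point.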

There are a number of available algorithms for tagging top jets. We compare shower
deconstruction with the publicly available taggers mentioned in Sec. IV A. We have used
each of these in turn to tag the jets in our signal and background event samples, using
the default parameters of the algorithm. For a specific choice of parameters the tagging
performance can be characterized by one point on the signal acceptance vs. background fake
rate plane. We have plotted the corresponding points in Fig. 4. Notice that we use fixed
windows in top mass and W mass and use the default parameters for each tagger. Then
each tagger appears as a point in Fig. 4. See Ref. [24] for graphs in which the mass windows
and input parameters are varied.

FIG. 4: Background fake rate F as a function of signal acceptance A for shower deconstruction
with the signal and background event samples described in Sec. IV A. The curve for shower
deconstruction is compared to F vs. A points for the Johns Hopkins top tagger (JH), the top tagger of the
CMS group (CMS), the Heidelberg-Eugene-Paris top tagger (HEP), and the use of N-subjettiness
as a top tagger (NSUB). We show the results on a linear scale (left) and on a logarithmic scale
(right).

Using only fixed $m_t$ and $m_W$ windows and the default input parameters, there is no
definite answer to the question of which top tagger does the best job because each has a different
signal acceptance. One might favor the HEP tagger over the JH tagger because the HEP
tagger has a higher signal acceptance or might favor the JH tagger over the HEP tagger
because the JH tagger has a lower background fake rate. Nevertheless, for any given signal
acceptance A, a lower background fake rate F is best. The ratio of the background fake
rate $F_{\rm JH}$ for the JH tagger to the background fake rate $F_{\rm sd}(A_{\rm JH})$ from shower
deconstruction at the same signal acceptance $A_{\rm JH}$ as given by the JH tagger is about 3.6. Similarly,
$F_{\rm CMS}/F_{\rm sd}(A_{\rm CMS}) \approx 2.7$, $F_{\rm HEP}/F_{\rm sd}(A_{\rm HEP}) \approx 2.6$, and $F_{\rm NSUB}/F_{\rm sd}(A_{\rm NSUB}) \approx 2.4$. For this
reason, one may regard shower deconstruction as doing better than any of the previously
available top taggers. The right plot of Fig. 4 shows the results on a logarithmic scale. With
this plot, it is easier to see that one can gain a lot in making the background fake rate
smaller if one is willing to sacrifice signal acceptance. For instance, with a signal acceptance
of 0.1 one can reduce the background fake rate to about $5 \times 10^{-4}$.
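
The comparison at equal signal acceptance can be expressed as a short sketch. It assumes the shower deconstruction curve is available as a list of (A, F) points sorted by increasing A and uses linear interpolation between neighboring points, which is a choice made here for illustration rather than something specified in the text. The function names are hypothetical.

```python
def fake_rate_at(acceptance, sd_curve):
    """Linearly interpolate the fake rate of the curve at a given acceptance.

    sd_curve is a list of (A, F) pairs sorted by increasing A.
    """
    for (a0, f0), (a1, f1) in zip(sd_curve, sd_curve[1:]):
        if a0 <= acceptance <= a1:
            if a1 == a0:
                return f0
            weight = (acceptance - a0) / (a1 - a0)
            return f0 + weight * (f1 - f0)
    raise ValueError("acceptance lies outside the range of the curve")

def improvement_ratio(tagger_point, sd_curve):
    """F_tagger / F_sd(A_tagger) for a tagger given as a single (A, F) point."""
    acceptance, fake_rate = tagger_point
    return fake_rate / fake_rate_at(acceptance, sd_curve)
```

A ratio above one means that, at the tagger's own working point, the curve achieves a lower fake rate.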



E. Results with Herwig++

The results presented above were based on signal and background events generated
with Pythia. One may wonder whether these results are sensitive to which Monte Carlo
event generator is used to generate events. To answer this question, we repeated the
analysis using events generated with Herwig++. We find that with Herwig++
signal events, $(1/N)\, dN/d\log\chi$ in the region $\chi > 0$ is about 8% larger than with Pythia
events, while $(1/N)\, dN/d\log\chi$ for background events generated with Herwig++ is close
to $(1/N)\, dN/d\log\chi$ for background events generated with Pythia. This leads to very
similar results for the background fake rate as a function of signal acceptance. We display this
comparison in Fig. 5.

FIG. 5: Results using Herwig++ compared to those using Pythia from Fig. 4. The solid curve
is the F versus A curve from shower deconstruction using events generated with Pythia; the
dashed curve uses events generated with Herwig++. The solid circles show F versus A results
for the top taggers using events generated with Pythia; the open squares use events generated
with Herwig++.



V. RESULTS FOR LOW-$p_T$ TOP JET

While most of the taggers we compare to are designed for moderately boosted top quarks,
with $p_T$ of $\mathcal{O}(500)$ GeV, reconstructing top quarks with only a small
boost, $\mathcal{O}(200)$ GeV, is more challenging. However, reconstructing top quarks in this low-$p_T$
region is phenomenologically highly relevant for a large variety of standard model [H [25] 
and beyond the standard model [26–30] searches.

Due to the smaller boost of the top quark, the decay products are widely separated. 
If the fat jet radius is not large enough to capture most of the decay products of the top 
quark, the taggers will not be able to positively identify a top jet. Therefore, a large cone 
size is necessary to reconstruct top quarks with small boost. However, this will allow a lot of 
top-uncorrelated radiation to enter the fat jet, i.e. initial state radiation and contributions 
from the underlying event. 



Compared to the scenario studied in Sec. IV, we only change the fat jet algorithm and the
related event selection cuts. We again reconstruct the fat jets using the Cambridge-Aachen
algorithm, but now with $R = 1.5$. Events are accepted for further analysis if they contain at
least two jets with $p_{T,j} > 200$ GeV each.



We find that all taggers perform worse in this scenario than in Sec. IV; see Fig. 6.
However, even in this challenging scenario shower deconstruction performs better than the
other taggers. Here the relative improvements are $F_{\rm JH}/F_{\rm sd}(A_{\rm JH}) \approx 4.2$,
$F_{\rm CMS}/F_{\rm sd}(A_{\rm CMS}) \approx 4.6$, $F_{\rm HEP}/F_{\rm sd}(A_{\rm HEP}) \approx 2.6$,
and $F_{\rm NSUB}/F_{\rm sd}(A_{\rm NSUB}) \approx 11.9$, where $F_{\rm sd}(A)$ is the shower
deconstruction fake rate at the signal acceptance $A$ of the tagger being compared. The
HEPTopTagger is the only tagger explicitly designed to work for low-$p_T$ top quarks.
Consequently, it shows the smallest change in the performance ratio compared to the scenario
with medium boosted top quarks. Crucial for good performance in this scenario is a built-in
grooming procedure, which the CMS and JH taggers largely lack and the NSubjettiness tagger
lacks completely. Thus, particularly for the NSubjettiness tagger, one can expect a
performance improvement by changing the top mass window.

FIG. 6: Background fake rate F as a function of signal acceptance A for shower deconstruction
with the signal and background event samples using a 200 GeV cut on jet $p_T$ and a fat jet cone
size of R = 1.5, as described in Sec. V. The curve for shower deconstruction is compared to F vs
A points for the Johns Hopkins top tagger (JH), the top tagger of the CMS group (CMS), the
Heidelberg-Eugene-Paris top tagger (HEP), and the use of N-subjettiness as a top tagger (NSUB).
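The relative improvement of shower deconstruction over another tagger is the ratio of that tagger's fake rate to the shower deconstruction fake rate at the same signal acceptance. The sketch below assumes the shower deconstruction curve is tabulated as (A, F) pairs and interpolates it linearly; the interpolation scheme, the function names, and all numbers are illustrative assumptions, not values from this analysis.

```python
from bisect import bisect_left


def fake_rate_at(acceptance, curve):
    """Linearly interpolate the fake rate F at a given acceptance A.

    curve is a list of (A, F) pairs sorted by increasing A.
    """
    accs = [a for a, _ in curve]
    if not accs[0] <= acceptance <= accs[-1]:
        raise ValueError("acceptance outside the tabulated curve")
    i = bisect_left(accs, acceptance)
    if accs[i] == acceptance:
        return curve[i][1]
    (a0, f0), (a1, f1) = curve[i - 1], curve[i]
    return f0 + (f1 - f0) * (acceptance - a0) / (a1 - a0)


def relative_improvement(tagger_point, sd_curve):
    """Return F_tagger / F_sd(A_tagger) for one (A, F) working point."""
    a_tag, f_tag = tagger_point
    return f_tag / fake_rate_at(a_tag, sd_curve)


# Made-up numbers, for illustration only.
sd_curve = [(0.1, 0.001), (0.2, 0.004), (0.3, 0.010), (0.4, 0.020)]
other_tagger = (0.25, 0.021)
improvement = relative_improvement(other_tagger, sd_curve)
print(improvement)
```

A value above 1 means that, at equal signal acceptance, shower deconstruction accepts fewer background jets than the tagger it is compared with.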



VI. CONE SIZE DEPENDENCE 

In this section, we study how sensitive shower deconstruction is to the cone size and the
overall amount of uncorrelated soft radiation in the fat jet. We use an event sample in which
the fat jet is highly boosted: we require $p_T > 800$ GeV. We then plot in Fig. 7 the
background fake rate versus signal acceptance for cone sizes R = 1.5, 1.25, and 1.00. We see
that the cone size makes very little difference. The larger cone sizes include more debris
from initial state radiation, but the shower deconstruction algorithm seems not to be
confused by this debris.



VII. MEASURING PARAMETERS OF THE THEORY 

In many applications of shower deconstruction, some parameters of the theory for the
sought signal events may not be known. In that case, one would like not only to show from
the data that the sought signal is present in nature but also to measure the unknown
parameters. In the example used in this paper, suppose that we did not know the mass $M_W$
of the W boson. Then we could find $M_W$ from the data. There is one true $M_W$ in nature
(80.4 GeV in our Monte Carlo event sample). However, $M_W$ is also a parameter in the model
used in the shower deconstruction algorithm. If the model $M_W$ is not right, then the
shower deconstruction results should tell us.

In a complete analysis, one would construct from the data the ratio of the likelihood that
the observed data is generated by the signal plus the QCD background to the likelihood
that the data is generated by background only. Then this likelihood ratio should be small
if the model $M_W$ is far from the true $M_W$ and should peak at $M_W^{\rm model} = M_W$.

FIG. 7: Shower deconstruction results for highly boosted jets ($p_T > 800$ GeV) showing the
dependence on the cone size of the fat jet. The solid curve is the F versus A curve from shower
deconstruction for fat jets defined with R = 1.50; the long dashed curve uses jets with R = 1.25;
the short dashed curve uses jets with R = 1.00.

To explore this with a simpler calculation, we have, for the event sample described in
Sec. IV A, applied shower deconstruction for a range of model $M_W$ choices. Then we have
calculated the background fake rate and the signal acceptance with the cut $\chi > 384$, which
corresponds to approximately a 20% signal acceptance when $M_W^{\rm model} = M_W$. The background
fake rate rises slowly as $M_W^{\rm model}$ increases. The signal acceptance has a peak at
$M_W^{\rm model} = M_W$. We calculated the ratio of signal acceptance to background fake rate as
a function of $M_W^{\rm model}$. The results are shown in Fig. 8. We see that this ratio, as
expected, exhibits a peak at $M_W^{\rm model} = M_W$. We notice that the shape of the curve is
not symmetric: a real signal event can look like an $M_W^{\rm model} > M_W$ signal event when
extraneous gluons from initial state radiation get counted as part of the W decay products.
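The scan over the model W mass amounts to recomputing, for each mass hypothesis, the fraction of signal and background jets that pass the fixed cut on the likelihood-ratio variable. The sketch below assumes the tagger has already been run once per model mass and that the resulting values are stored per mass; the function names and all numbers are invented for illustration and are not results of this analysis.

```python
def pass_fraction(chi_values, cut):
    """Fraction of jets whose chi exceeds the cut."""
    return sum(1 for chi in chi_values if chi > cut) / len(chi_values)


def acceptance_over_fake_rate(chi_signal, chi_background, cut=384.0):
    """Signal acceptance divided by background fake rate for each model mass."""
    ratios = {}
    for mass, signal_values in chi_signal.items():
        acceptance = pass_fraction(signal_values, cut)
        fake_rate = pass_fraction(chi_background[mass], cut)
        ratios[mass] = acceptance / fake_rate if fake_rate > 0 else float("inf")
    return ratios


# Invented chi values per model W mass (GeV); a real scan would rerun the tagger.
chi_signal = {
    70.0: [10.0, 500.0, 90.0, 30.0, 20.0],
    80.4: [600.0, 450.0, 50.0, 900.0, 20.0],
    90.0: [400.0, 100.0, 60.0, 30.0, 700.0],
}
chi_background = {
    70.0: [1.0, 2.0, 400.0, 3.0, 5.0, 8.0, 2.0, 1.0, 9.0, 4.0],
    80.4: [2.0, 500.0, 1.0, 3.0, 5.0, 8.0, 2.0, 1.0, 9.0, 4.0],
    90.0: [390.0, 2.0, 450.0, 3.0, 5.0, 8.0, 2.0, 1.0, 9.0, 4.0],
}
ratios = acceptance_over_fake_rate(chi_signal, chi_background)
best_mass = max(ratios, key=ratios.get)
print(best_mass, ratios)
```

The model mass that maximizes the ratio serves as the estimate of the W mass in this simplified procedure.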



VIII. CONCLUSION AND PROSPECTS 



In this paper, we have developed an algorithm for tagging top jets based on the method
of shower deconstruction. For this purpose we had to considerably extend the shower
deconstruction approach designed to reconstruct a Higgs boson as outlined in Ref. [12]. The
approach models parton evolution from the hard interaction scale at which a boosted top
quark is created down to the virtuality scale of the microjets that serve as the input to
the calculation. For this, one needs the decay matrix elements for $t \to W + b$ and then
$W \to q + \bar q$. Then one needs the splitting probabilities and Sudakov factors for QCD
showering for the massive top quark, a massless bottom quark, and light partons created in
the W boson decay. The splitting probabilities include appropriate factors for quantum
interference for radiation of soft gluons from color dipoles.

FIG. 8: Signal acceptance divided by background fake rate for a cut $\chi > 384$ as a function of
the W mass, $M_W^{\rm model}$, used in the shower deconstruction algorithm. There is a peak at the
true W mass.

We find that shower deconstruction performs significantly better than any of the publicly
available taggers that we compared with for either a moderately boosted top quark with
$p_T^{\rm jet} > 500$ GeV or one that is only boosted to $p_T^{\rm jet} > 200$ GeV. Also, we
found that the performance of shower deconstruction is not very sensitive to the cone size
used to define the fat jet as long as the cone size is large enough to contain the top quark
decay products.

Because shower deconstruction performs a hypothesis test for competing theories or
processes, it can be used to measure their parameters. As an example, we varied the W boson
mass in the reconstruction algorithm of the top. When the hypothesis matched nature, as
simulated by a full event generator, the reconstruction significance was maximized, thereby
allowing the W boson mass to be measured.

Our subject in this paper has been limited to distinguishing top quark jets from
background jets. One can also imagine assigning a variable $\chi$ to events containing multiple
jets according to the ratio of the likelihoods that the event was produced by a signal process
of interest or was produced by an ordinary background process. For instance, one could look
for events produced by the decay of a new, heavy, vector boson Z' that decays to $t + \bar t$.
Then we need to distinguish such signal events from Standard Model events with two jets that
may, or may not, be top jets. Shower deconstruction of individual jets can, we believe, be
extended to cover event deconstruction of whole events. We leave this extension to future
work.
Appendix A: Appendix

In this appendix, we fill in some of the details about the factors that go into shower 
deconstruction for this analysis. Most of the ingredients are the same as in Ref. [12]. Thus 
we present only new features that are needed to include the decays of top quarks and W 
bosons and to include gluon radiation from the top quark. 



1. Top quark decay probability 

In a parton shower, the total probability for a splitting has the form $H e^{-S}$, where $H$ is
the probability that the parton splits at shower time $t$ and $e^{-S}$ is the probability that
it has not split at an earlier shower time. We can formulate parton decay in the same way. For
the decay, let us denote $H e^{-S} = \mathcal{H}$. Then for a top quark decay, we take

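$$ \mathcal{H}_t = C_t\,\frac{2\pi m_t\Gamma_t}{2\arctan(\Delta_t/\Gamma_t)}\,
\frac{\Theta(|p_t^2 - m_t^2| < m_t\Delta_t)}{(p_t^2 - m_t^2)^2 + m_t^2\Gamma_t^2} , \tag{A1} $$

where

$$ C_t = \frac{8\pi m_t^2}{m_t^2 - m_W^2} . \tag{A2} $$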

The main feature of this is the standard Breit-Wigner denominator,
$(p_t^2 - m_t^2)^2 + m_t^2\Gamma_t^2$, where $\Gamma_t$ is the decay width.$^2$ For shower
deconstruction, we supply an extra factor, $\Theta(|p_t^2 - m_t^2| < m_t\Delta_t)$, where
$\Delta_t$ is greater than $\Gamma_t$. We insert this factor as an approximation in order to
eliminate entirely shower histories for which $\mathcal{H}_t$ would be small. There are two
normalization factors. One is fixed by

$$ \int \frac{dp_t^2}{2\pi}\,\frac{2\pi m_t\Gamma_t}{2\arctan(\Delta_t/\Gamma_t)}\,
\frac{\Theta(|p_t^2 - m_t^2| < m_t\Delta_t)}{(p_t^2 - m_t^2)^2 + m_t^2\Gamma_t^2} = 1 . \tag{A3} $$

The second, $C_t$, is fixed by

$$ C_t \int \frac{d^3\vec p_b}{(2\pi)^3\,2E_b} \int \frac{d^3\vec p_W}{(2\pi)^3\,2E_W}\,
(2\pi)^4\,\delta^4(p_b + p_W - p_t) = 1 \tag{A4} $$

as long as $p_t^2 = m_t^2$ and $p_W^2 = m_W^2$ are good approximations. Together, these
normalization factors ensure that the top quark decays with probability 1.
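Both normalization conditions can be checked numerically. The sketch below assumes the windowed Breit-Wigner form described above for Eq. (A3) and takes $C_t$ to be the inverse of the two-body phase space for a massless b quark, $C_t = 8\pi m_t^2/(m_t^2 - m_W^2)$. The values chosen for the width and the window parameter are illustrative assumptions, not the ones used in the analysis.

```python
import math


def breit_wigner_weight(p2, m, gamma, delta):
    """Integrand of Eq. (A3) without the dp^2 / (2 pi) measure."""
    x = p2 - m * m
    if abs(x) >= m * delta:
        return 0.0  # outside the window |p^2 - m^2| < m * delta
    norm = 2.0 * math.pi * m * gamma / (2.0 * math.atan(delta / gamma))
    return norm / (x * x + (m * gamma) ** 2)


def integrate_a3(m, gamma, delta, n=20_000):
    """Midpoint-rule estimate of the left-hand side of Eq. (A3)."""
    lo, hi = m * m - m * delta, m * m + m * delta
    h = (hi - lo) / n
    total = sum(
        breit_wigner_weight(lo + (i + 0.5) * h, m, gamma, delta) for i in range(n)
    )
    return total * h / (2.0 * math.pi)


def c_top(m_t, m_w):
    """C_t as the inverse two-body phase space for t -> W b with a massless b."""
    return 8.0 * math.pi * m_t**2 / (m_t**2 - m_w**2)


# Illustrative values in GeV; gamma and delta are assumptions.
m_t, m_w, gamma_t, delta_t = 173.0, 80.4, 5.0, 15.0
print(integrate_a3(m_t, gamma_t, delta_t))
print(c_top(m_t, m_w))
```

The first number should be very close to 1, which reflects the fact that the factor $2\arctan(\Delta_t/\Gamma_t)$ compensates exactly for truncating the Breit-Wigner tails.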

Now we need the Sudakov exponent $S_t(t, t_0)$, which is the integral of $H_t/C_t$ from a
starting shower time $t_0$ defined by the previous splitting to a shower time $t$ related to
$|p_t^2 - m_t^2|$ according to Eq. (5). If we define

$^2$ We choose the simulated width $\Gamma_t$ larger than the physical top quark width in order
to approximately simulate an imperfect resolution in measuring jet momenta. See Sec. IV B.



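then $S_t(t, t_0) = S_t(t, t_{\min}) - S_t(t_0, t_{\min})$. Taking into account the jacobian to
change integration variables from $p_t^2$ to $t$, we get

$$ S_t(t, t_{\min}) = \log\left[\arctan(\Delta_t/\Gamma_t)\right]
- \log\left[\arctan\left(\frac{|p_t^2 - m_t^2|}{m_t\Gamma_t}\right)\right] . \tag{A6} $$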



Having found $S_t$, we immediately obtain the decay function without the Sudakov factor,

$$ H_t = C_t\,\frac{2\pi m_t\Gamma_t}{2\arctan(|p_t^2 - m_t^2|/[m_t\Gamma_t])}\,
\frac{\Theta(|p_t^2 - m_t^2| < m_t\Delta_t)}{(p_t^2 - m_t^2)^2 + m_t^2\Gamma_t^2} . \tag{A7} $$
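The relation between the total decay probability, the Sudakov exponent, and the decay function without the Sudakov factor can be verified directly. The sketch below assumes that Eq. (A1) is the windowed Breit-Wigner with normalization $2\arctan(\Delta_t/\Gamma_t)$, that Eq. (A6) reads $S_t = \log[\arctan(\Delta_t/\Gamma_t)] - \log[\arctan(|p_t^2 - m_t^2|/(m_t\Gamma_t))]$, and that Eq. (A7) is Eq. (A1) with $\Delta_t/\Gamma_t$ replaced by $|p_t^2 - m_t^2|/(m_t\Gamma_t)$ inside the arctangent. Function names and parameter values are illustrative.

```python
import math


def total_decay_probability(p2, m, gamma, delta, c_t=1.0):
    """Eq. (A1): decay function including the Sudakov factor."""
    x = abs(p2 - m * m)
    if x >= m * delta:
        return 0.0
    norm = 2.0 * math.pi * m * gamma / (2.0 * math.atan(delta / gamma))
    return c_t * norm / (x * x + (m * gamma) ** 2)


def sudakov_exponent(p2, m, gamma, delta):
    """Eq. (A6): exponent measured from the edge of the mass window."""
    x = abs(p2 - m * m)
    return math.log(math.atan(delta / gamma)) - math.log(math.atan(x / (m * gamma)))


def decay_function(p2, m, gamma, delta, c_t=1.0):
    """Eq. (A7): decay function without the Sudakov factor."""
    x = abs(p2 - m * m)
    if x >= m * delta:
        return 0.0
    norm = 2.0 * math.pi * m * gamma / (2.0 * math.atan(x / (m * gamma)))
    return c_t * norm / (x * x + (m * gamma) ** 2)


# Illustrative values in GeV; gamma and delta are assumptions.
m_t, gamma_t, delta_t = 173.0, 5.0, 15.0
p2 = (m_t + 3.0) ** 2  # a point off resonance but inside the window
lhs = decay_function(p2, m_t, gamma_t, delta_t) * math.exp(
    -sudakov_exponent(p2, m_t, gamma_t, delta_t)
)
rhs = total_decay_probability(p2, m_t, gamma_t, delta_t)
print(lhs, rhs)
```

The exponent vanishes at the edge of the window and grows without bound as $p_t^2 \to m_t^2$, so the example evaluates the functions away from the exact resonance point.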



We have so far considered top quark decay in isolation. However, a top quark carries
color and therefore can emit a gluon. In any interval $dt$ of shower time, the top quark can
either emit a gluon or decay. The top quark emits a gluon with a probability determined by
a splitting function $H_{ttg}$ that we will discuss in Sec. A 3. The gluon emission process has
its own Sudakov exponent, $S_{ttg}$. The probability that the top quark has neither emitted a
gluon nor decayed by shower time $t$ is determined by the sum of the Sudakov exponents,
$\exp[-(S_t + S_{ttg})]$.



2. W boson decay probability 

The W boson created in the top quark decay will itself decay to a quark $q$ and an antiquark
$\bar q$. For the total splitting probability $\mathcal{H} = H e^{-S}$, we take

$$ \mathcal{H}_W = 8\pi\,\frac{2\pi m_W\Gamma_W}{2\arctan(\Delta_W/\Gamma_W)}\,
\frac{\Theta(|p_W^2 - m_W^2| < m_W\Delta_W)}{(p_W^2 - m_W^2)^2 + m_W^2\Gamma_W^2}\;
g_W(p_q, p_{\bar q}, p_t) . \tag{A8} $$

This is like the decay probability for the top quark except that now we have an extra function
$g_W$. The W has spin 1 and it is polarized. That is, it has a non-trivial spin density matrix.
That happens because of the decay process that created the W. The W polarization leads to
an angular dependence of the decay products' momenta as seen in the W rest frame. This
angular dependence is represented by the function $g_W$. Since the polarization arises from
the top decay, $g_W$ depends on $p_t$. Specifically,

$$ g_W(p_q, p_{\bar q}, p_t) = \frac{\cdots}{(m_t^2 - m_W^2)(m_t^2 + 2 m_W^2)} . \tag{A9} $$

Since the W boson is colorless, there is no competition between W decay and gluon
emission. For this reason, it is enough to represent the total probability for the W to decay
by $\mathcal{H}_W$ without separately using a Sudakov factor.



3. Top quark splitting function 



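The top quark can emit a gluon. The splitting function for this, $H_{ttg}$, differs from a
light quark splitting function because of the top quark mass. Following closely the reasoning
in Ref. [12], we take

$$ H_{ttg} = \frac{8\pi C_F\alpha_s}{\mu_J^2}\,\frac{k_J}{k_g}
\left(1 + \left(\frac{k_t}{k_J}\right)^2\right) g(p_g, p_t, p_k)\,
\Theta\!\left(2\,\frac{\mu_J^2}{k_J} < \frac{\mu_K^2}{k_K}\right) . \tag{A10} $$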
Here J refers to the mother top quark and $\mu_J^2 = p_J^2 - m_t^2$. Then t and g refer to the
daughter top quark and the daughter gluon, respectively, and $k_t$ and $k_g$ are their
transverse momenta. If we denote $k_g \approx (1 - z)k_J$ and $k_t \approx z k_J$, we recognize
the familiar collinear splitting function $(1 + z^2)/(1 - z)$ in $H_{ttg}$. There is a theta
function that enforces ordering of the shower emissions in shower time. In this theta function,
$\mu_K^2$ denotes the virtuality in the previous emission from the top quark and $k_K$ denotes
the transverse momentum of the top quark just before this emission. In a strongly ordered
shower, we would have $\mu_J^2/k_J \ll \mu_K^2/k_K$. In our shower, we settle for a factor of 2
between these scales. In the case that there was no previous splitting, the theta function in
$H$ is not needed and we ignore it, while in the corresponding calculation of the Sudakov
exponent we replace $\mu_K^2/k_K \to 2(k_J^2 + m_t^2)/k_J$ in the theta function.

When the emitted gluon is soft, there is quantum interference between emission of the
gluon from the top quark and emission from some other (massless) parton k that is color
connected to the top quark. We partition the emission probability from the whole dipole
into two terms, one of which looks mostly like emission from the top quark and the other of
which looks mostly like emission from parton k. The term that looks mostly like emission
from the top quark is $H_{ttg}$. The influence of the color connected partner is seen in the
function $g(p_g, p_t, p_k)$, which is

$$ g(p_g, p_t, p_k) = -\frac{k_g\; p_g\cdot p_t}{2 k_t}
\left(\frac{p_t}{p_g\cdot p_t} - \frac{p_k}{p_g\cdot p_k}\right)^2 A'_{tk} . \tag{A11} $$

The first factor here is simply the inverse of the soft gluon limit of the factors that we have
included in the collinear part of $H_{ttg}$. The second factor is the squared matrix element
for emission of a soft gluon with momentum $p_g$ from a dipole consisting of partons with
momenta $p_t$ and $p_k$. The third factor is a function $A'_{tk}$ that serves to partition the
dipole squared matrix element into the two terms mentioned above. There is some arbitrariness
in choosing this function. We take

$$ A'_{tk} = \frac{p_g\cdot p_k\; k_t}{p_g\cdot p_k\; k_t + p_g\cdot p_t\; k_k} . \tag{A12} $$

After expanding the factors here, we have

$$ g(p_g, p_t, p_k) = \frac{k_g}{2\, p_g\cdot p_t}\;
\frac{2\, p_g\cdot p_t\; p_t\cdot p_k - m_t^2\; p_g\cdot p_k}
{p_g\cdot p_k\; k_t + p_g\cdot p_t\; k_k} . \tag{A13} $$

It is convenient to write this in terms of the angles between the partons, using the
approximation that these angles are small. Using rapidities $y$ and azimuthal angles $\phi$ of
the partons, we define

$$ \begin{aligned}
\theta_{gt}^2 &= (y_g - y_t)^2 + (\phi_g - \phi_t)^2 , \\
\theta_{gk}^2 &= (y_g - y_k)^2 + (\phi_g - \phi_k)^2 , \\
\theta_{tk}^2 &= (y_t - y_k)^2 + (\phi_t - \phi_k)^2 .
\end{aligned} \tag{A14} $$

Then for small angles and $m_t^2/k_t^2 \ll 1$ we have the function $g$ in the form in which we
use it to compute $H_{ttg}$:

$$ g(p_g, p_t, p_k) \approx
\frac{(\theta_{gt}^2 + m_t^2/k_t^2)(\theta_{tk}^2 + m_t^2/k_t^2) - (m_t^2/k_t^2)\,\theta_{gk}^2}
{(\theta_{gt}^2 + m_t^2/k_t^2)(\theta_{gk}^2 + \theta_{gt}^2 + m_t^2/k_t^2)} . \tag{A15} $$

Notice that, by construction, $g$ is not singular when $\theta_{gk}^2 \to 0$.
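The quality of the small-angle form can be checked against the four-vector expression. The sketch below assumes that Eq. (A13) reads $g = k_g\,(2\,p_g\cdot p_t\;p_t\cdot p_k - m_t^2\,p_g\cdot p_k)/[2\,p_g\cdot p_t\,(p_g\cdot p_k\,k_t + p_g\cdot p_t\,k_k)]$ and that Eq. (A15) is its limit for small angles and $m_t^2/k_t^2 \ll 1$. The kinematic configuration and all function names are illustrative assumptions.

```python
import math


def four_vector(kt, y, phi, mass=0.0):
    """(E, px, py, pz) from transverse momentum, rapidity, azimuth and mass."""
    mt = math.sqrt(kt * kt + mass * mass)  # transverse mass
    return (mt * math.cosh(y), kt * math.cos(phi), kt * math.sin(phi), mt * math.sinh(y))


def dot(p, q):
    """Minkowski product with metric (+, -, -, -)."""
    return p[0] * q[0] - p[1] * q[1] - p[2] * q[2] - p[3] * q[3]


def g_exact(gluon, top, partner, m_t):
    """Four-vector form assumed for Eq. (A13); partons are (kt, y, phi) tuples."""
    p_g = four_vector(*gluon)
    p_t = four_vector(*top, mass=m_t)  # only the top quark is massive
    p_k = four_vector(*partner)
    k_g, k_t, k_k = gluon[0], top[0], partner[0]
    gt, gk, tk = dot(p_g, p_t), dot(p_g, p_k), dot(p_t, p_k)
    return k_g * (2.0 * gt * tk - m_t**2 * gk) / (2.0 * gt * (gk * k_t + gt * k_k))


def g_small_angle(gluon, top, partner, m_t):
    """Small-angle form assumed for Eq. (A15)."""
    def theta2(a, b):
        return (a[1] - b[1]) ** 2 + (a[2] - b[2]) ** 2

    r = (m_t / top[0]) ** 2
    th_gt, th_gk, th_tk = theta2(gluon, top), theta2(gluon, partner), theta2(top, partner)
    num = (th_gt + r) * (th_tk + r) - r * th_gk
    den = (th_gt + r) * (th_gk + th_gt + r)
    return num / den


# Illustrative configuration: (kt in GeV, rapidity, azimuth).
gluon, top, partner = (20.0, 0.05, 0.0), (2000.0, 0.0, 0.0), (500.0, 0.0, 0.1)
exact = g_exact(gluon, top, partner, 173.0)
approx = g_small_angle(gluon, top, partner, 173.0)
print(exact, approx)
```

For this highly boosted configuration the two forms agree at the percent level or better; the agreement degrades as the angles or $m_t/k_t$ grow.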
4. Massless parton splitting functions



We treat all quarks except for the top quark as being massless. When a massless quark 
splits by emitting a gluon, the splitting function is 



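$$ H_{qqg} = \frac{8\pi C_F\alpha_s}{\mu_J^2}\,\frac{k_J}{k_g}
\left(1 + \left(\frac{k_q}{k_J}\right)^2\right) g(p_g, p_q, p_k)\,
\Theta\!\left(2\,\frac{\mu_J^2}{k_J} < \frac{\mu_K^2}{k_K}\right) . \tag{A16} $$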



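Here J refers to the mother quark and $\mu_J^2 = p_J^2$. Then q and g refer to the daughter
quark and the daughter gluon, respectively, and $k_q$ and $k_g$ are their transverse momenta.
When a gluon splits by emitting a gluon, the splitting function is

$$ H_{ggg} = \frac{8\pi C_A\alpha_s}{\mu_J^2}\,\frac{k_J^2}{k_s k_h}
\left[1 - \frac{k_s k_h}{k_J^2}\right]^2 g(p_s, p_h, p_k)\,
\Theta\!\left(2\,\frac{\mu_J^2}{k_J} < \frac{\mu_K^2}{k_K}\right) . \tag{A17} $$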



Now h and s refer to the daughter gluon with the greater transverse momentum $k_h$ and
the daughter gluon with the smaller transverse momentum $k_s$, respectively. For the gluon
splitting, if we approximate $k_s/k_J = 1 - z$ and $k_h/k_J = z$, we see that $H_{ggg}$ contains
a collinear splitting factor $[1 - z(1 - z)]^2/[z(1 - z)]$, in contrast to the quark splitting
factor $[1 + z^2]/(1 - z)$. In both $H_{qqg}$ and $H_{ggg}$, there is a theta function that
enforces ordering of the shower splittings in shower time, as in the previous subsection.
Except for the function $g$, these are the same functions $H$ that we used in Ref. [12].
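The correspondence between the transverse momentum ratios and the familiar collinear splitting factors is easy to confirm numerically. The sketch below assumes the collinear approximation in which the mother transverse momentum is the sum of the daughter transverse momenta; function names and numbers are illustrative.

```python
def quark_collinear_factor(k_q, k_g):
    """(k_J / k_g) * (1 + (k_q / k_J)^2), with k_J = k_q + k_g assumed."""
    k_j = k_q + k_g
    return (k_j / k_g) * (1.0 + (k_q / k_j) ** 2)


def gluon_collinear_factor(k_h, k_s):
    """(k_J^2 / (k_s k_h)) * (1 - k_s k_h / k_J^2)^2, with k_J = k_h + k_s assumed."""
    k_j = k_h + k_s
    return (k_j**2 / (k_s * k_h)) * (1.0 - k_s * k_h / k_j**2) ** 2


z, k_j = 0.7, 400.0  # momentum fraction of the harder daughter, mother kT in GeV
hard, soft = z * k_j, (1.0 - z) * k_j

quark_from_k = quark_collinear_factor(hard, soft)
quark_from_z = (1.0 + z * z) / (1.0 - z)
gluon_from_k = gluon_collinear_factor(hard, soft)
gluon_from_z = (1.0 - z * (1.0 - z)) ** 2 / (z * (1.0 - z))
print(quark_from_k, quark_from_z)
print(gluon_from_k, gluon_from_z)
```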

There is also a function g. When the emitted gluon is soft, there is quantum interference 
between emission of the gluon from parton J and emission from some other parton k that is 
color connected to the splitting parton. We partition the emission probability from the whole 
dipole into two terms, as in the previous subsection. The influence of the color connected 
partner is seen in the function g. This is the same function for emission from a quark and 
emission from a gluon, but with different variable names. With the same logic as in the 
previous section, we have 



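$$ g(p_g, p_q, p_k) =
\frac{(\theta_{gk}^2 + m_k^2/k_k^2)(\theta_{qk}^2 + m_k^2/k_k^2) - (m_k^2/k_k^2)\,\theta_{gq}^2}
{(\theta_{gk}^2 + m_k^2/k_k^2)(\theta_{gq}^2 + \theta_{gk}^2 + m_k^2/k_k^2)} . \tag{A18} $$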



This is the same function that we used in Ref. [12] except that here the color connected
partner k could be massive because it could be the top quark.



5. Dipole antenna splitting 

In shower II, a massless parton can emit a gluon with the participation of a color connected
parton that is the top quark just before its decay. In this case, a color dipole emits the gluon
and we do not partition the emission into two pieces. Rather, we consider the dipole to be a
unit, sometimes called a dipole antenna. The splitting function is then given by Eq. (A16)
or Eq. (A17), depending on whether the emitting parton is a quark or a gluon. The only
difference from the preceding subsection is that now we omit the partitioning function
$A'_{qk}$ or $A'_{hk}$. With this choice, the angular function is

$$ g(p_g, p_q, p_k) =
\frac{(\theta_{gk}^2 + m_k^2/k_k^2)(\theta_{qk}^2 + m_k^2/k_k^2) - (m_k^2/k_k^2)\,\theta_{gq}^2}
{(\theta_{gk}^2 + m_k^2/k_k^2)^2} . \tag{A19} $$
Here parton k is the top quark, so $m_k = m_t$. Notice that if we were to set $m_k$ to zero,
this function would be singular when $\theta_{gk}^2 \to 0$. That is the consequence of omitting
$A'_{qk}$. Because $m_t > 0$, there is no singularity.
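The role of the partner mass in regulating the antenna function can be seen by evaluating it as the gluon approaches the direction of parton k. The sketch below assumes that the partitioned function of Eq. (A18) and the antenna function of Eq. (A19) share the numerator $(\theta_{gk}^2 + m_k^2/k_k^2)(\theta_{qk}^2 + m_k^2/k_k^2) - (m_k^2/k_k^2)\theta_{gq}^2$ and differ only in the denominator. Function names and kinematic values are illustrative.

```python
def _numerator(theta_gq2, theta_gk2, theta_qk2, r):
    """Common numerator; r = m_k^2 / k_k^2 for the color connected partner."""
    return (theta_gk2 + r) * (theta_qk2 + r) - r * theta_gq2


def g_partitioned(theta_gq2, theta_gk2, theta_qk2, m_k, k_k):
    """Form assumed for Eq. (A18), which includes the partitioning function."""
    r = (m_k / k_k) ** 2
    den = (theta_gk2 + r) * (theta_gq2 + theta_gk2 + r)
    return _numerator(theta_gq2, theta_gk2, theta_qk2, r) / den


def g_antenna(theta_gq2, theta_gk2, theta_qk2, m_k, k_k):
    """Form assumed for Eq. (A19), with the partitioning function omitted."""
    r = (m_k / k_k) ** 2
    return _numerator(theta_gq2, theta_gk2, theta_qk2, r) / (theta_gk2 + r) ** 2


# Gluon nearly collinear with partner k: theta_gk^2 is tiny, theta_gq^2 = theta_qk^2.
theta_gq2, theta_gk2, theta_qk2 = 0.04, 1e-6, 0.04
k_k = 500.0  # partner transverse momentum in GeV (illustrative)

antenna_massless = g_antenna(theta_gq2, theta_gk2, theta_qk2, 0.0, k_k)
antenna_top = g_antenna(theta_gq2, theta_gk2, theta_qk2, 173.0, k_k)
partitioned_massless = g_partitioned(theta_gq2, theta_gk2, theta_qk2, 0.0, k_k)
print(antenna_massless, antenna_top, partitioned_massless)
```

With a massless partner the antenna function grows like $1/\theta_{gk}^2$, while with $m_k = m_t$ it stays of order 1; the partitioned function remains finite even for a massless partner.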



6. Sudakov exponents 

For each propagator in a shower history diagram, there is a Sudakov factor $e^{-S}$. This
factor gives the probability for the parton not to have split between the shower time of its
previous splitting and the shower time of the next splitting. If there is no next splitting,
then $e^{-S}$ represents the probability not to have split between the previous splitting and
the shower time that corresponds to the microjet virtuality. The top quark can either split or
decay, so there are two contributions to $S$. The W boson can only decay, so we simply
include $e^{-S}$ in the function $\mathcal{H}$ that gives the differential decay probability.

We calculate Sudakov exponents for QCD splittings using

$$ S = \int d\mu_J^2\;\Theta(\mu_{\min}^2 < \mu_J^2) \int dz \int d\varphi\; H(p_a, p_g) . \tag{A20} $$

Here $\mu_{\min}^2$ is the virtuality of the parton splitting. There is a $\mu_{\max}^2$
corresponding to the shower time of the previous splitting. The constraint
$\mu_J^2 < \mu_{\max}^2$ is included in the splitting function $H$. The splitting functions $H$
are given in Ref. [12] and in the preceding subsections. The variable $z$ is the momentum
fraction of the splitting and $\varphi$ is the azimuthal angle of the plane of the splitting
about the direction of the mother parton.

We need to express $S$ as a quickly computable function of the variables in the shower
history. Thus we cannot use numerical integration to evaluate the integrals in the definition
(A20). On the other hand, the integrals are too complicated to evaluate analytically. For
that reason, we have developed simple numerical approximations to the integrals and we
use these approximate functions. The approximations used are not really an essential part
of the physics: the ones that we use currently are different from those used in Ref. [12] and
if we found better approximations, we would use them. For that reason, it does not seem
useful to list the approximate functions used to represent the functions $S$.